Introduction to Streaming Data Analytics and Applications Minitrack

Authors: Mehmed Kantardzic, Jozef Zurada
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2017

Crossref

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Sliding Reservoir Approach for Delayed Labeling in Streaming Data Classification

Authors: Hanqing Hu, Mehmed Kantardzic
Publication venue: AIS Electronic Library (AISeL)
Publication date: 04/01/2017

When concept drift occurs within streaming data, a streaming data classification framework needs to update its learning model to maintain performance. In real-world applications, the labeled samples required for training a new model are often not immediately available. This labeling delay can degrade the performance of traditional streaming data classification frameworks. To solve this problem, we propose the Sliding Reservoir Approach for Delayed Labeling (SRADL). By combining chunk-based semi-supervised learning with a novel approach to managing labeled data, SRADL does not need to wait for the labeling process to finish before updating the learning model. Experiments with two delayed-label scenarios show that SRADL improves prediction performance over the naïve approach by as much as 7.5% in certain cases. The largest gain occurs in the real-world data experiments, with an 18-chunk labeling delay under the continuous label delivery scenario.
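The abstract does not spell out how SRADL manages its labeled data, so the following is only a generic sketch of the idea of a fixed-size buffer that absorbs labels as they arrive late. The class name `SlidingLabelReservoir`, its methods, and the first-in, first-out eviction rule are all hypothetical illustrations, not the authors' algorithm.

```python
from collections import deque


class SlidingLabelReservoir:
    """Hypothetical fixed-capacity store for samples whose labels arrive late."""

    def __init__(self, capacity):
        # Samples seen on the stream but not yet labeled, keyed by sample id.
        self.pending = {}
        # Most recently labeled samples; the oldest entry is evicted when full.
        self.labeled = deque(maxlen=capacity)

    def add_unlabeled(self, sample_id, features):
        self.pending[sample_id] = features

    def deliver_label(self, sample_id, label):
        """Attach a late label to a pending sample; return False if unknown."""
        features = self.pending.pop(sample_id, None)
        if features is None:
            return False
        self.labeled.append((features, label))
        return True

    def training_set(self):
        # Whatever is labeled so far can be used without waiting for the rest.
        return list(self.labeled)


reservoir = SlidingLabelReservoir(capacity=2)
for i in range(3):
    reservoir.add_unlabeled(i, [float(i)])

# Labels arrive later than the samples themselves.
reservoir.deliver_label(0, 1)
reservoir.deliver_label(1, 1)
reservoir.deliver_label(2, 0)
unknown = reservoir.deliver_label(99, 1)
current = reservoir.training_set()
```

With a capacity of two, the buffer keeps only the two most recently labeled samples, so a model retrained from `training_set()` would use whatever labels have arrived so far rather than waiting for a complete chunk.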

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Don’t Pay for Validation: Detecting Drifts from Unlabeled data Using Margin Density

Authors: Mehmed Kantardzic, Tegjyot Singh Sethi
Publication venue: The Authors. Published by Elsevier B.V.
Publication date: 31/12/2015

Validating online stream classifiers has traditionally assumed the availability of labeled samples, which can be monitored over time to detect concept drift. However, labeling in streaming domains is expensive, time consuming, and in certain applications, such as land mine detection, not possible at all. In this paper, the Margin Density Drift Detection (MD3) approach is proposed, which can signal change using unlabeled samples and requires labeling only for retraining in the event of a drift. When evaluated on 5 synthetic and 5 real-world drifting data streams, the MD3 approach produced classification accuracy statistically equivalent to that of a fully labeled accuracy-tracking drift detector, while requiring only a third of the samples to be labeled, on average.
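As a rough illustration of the quantity the abstract refers to, the sketch below computes a margin density for a linear classifier: the fraction of unlabeled samples whose decision value falls inside the margin. The function names, the margin width of 1.0, and the fixed-threshold comparison are assumptions made for this example; the abstract does not give the paper's actual detection rule.

```python
import numpy as np


def margin_density(X, w, b, margin=1.0):
    """Fraction of unlabeled samples whose decision value lies inside the margin."""
    scores = X @ w + b
    return float(np.mean(np.abs(scores) <= margin))


def drift_suspected(reference_md, current_md, threshold=0.2):
    # Assumed rule: flag a change when the density moves by more than a threshold.
    return abs(current_md - reference_md) > threshold


# Hypothetical linear classifier that separates on the first feature.
w = np.array([1.0, 0.0])
b = 0.0

# Reference window: every sample sits well outside the margin.
reference = np.array([[3.0, 0.0], [-3.0, 1.0], [2.5, -1.0], [-2.0, 0.5]])
# Later window: most samples have moved into the uncertain region.
drifted = np.array([[0.5, 0.0], [-0.2, 1.0], [0.1, -1.0], [-2.0, 0.5]])

ref_md = margin_density(reference, w, b)
cur_md = margin_density(drifted, w, b)
needs_labels = drift_suspected(ref_md, cur_md)
```

No labels are used anywhere in this computation. In this sketch, labels would only be requested once `needs_labels` becomes true, which mirrors the abstract's point that labeling is needed only for retraining.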

Elsevier - Publisher Connector